.require "memo.pub[let,jmc]" source
.cb MORE ON NON-MONOTONIC REASONING

.cb John McCarthy


	These notes supplement my paper %2Circumscription - A Form of
Non-Monotonic Reasoning%1, in press in %2Artificial Intelligence%1
and printed as Stanford Artificial Intelligence Memo AIM-334.
See also NONMON[S80,JMC].

	#. Section 7 - More on Blocks - axiomatizes the conditions for
moving one block onto another.  That formalization reifies (introduces
as individuals of the domain) "things" that may prevent the action.
This reification is unnecessary (though it may be useful), and
we get a somewhat neater formulation by replacing equation (22) of
that section by

!!a1:	%2∀x y s.(doable(move(x,y),s) ⊃ on(x,y,result(move(x,y),s)))%1,

where we have combined all the conditions for performing an action
into the notion that the action is ⊗doable in a situation.
We can write a general statement that an action has its normal
consequence provided it is doable in the form

!!a2:	%2∀action s.(doable(action,s) ⊃ conseq(action,result(action,s)))%1

and specialize it to the case at hand by writing

!!a3:	%2∀x y s'.(conseq(move(x,y),s') ≡ on(x,y,s'))%1.
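
	Note that ({eq a1}) is just the specialization of ({eq a2}) to
%2action = move(x,y)%1, with its conclusion rewritten by ({eq a3}),
taking %2s' = result(move(x,y),s)%1.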

	We further have

!!a4:	%2∀x y s.(tooheavy(x,s) ⊃ ¬doable(move(x,y),s))%1

and

!!a5:	%2∀x y s.(¬clear(y,s) ⊃ ¬doable(move(x,y),s))%1.

Circumscribing ⊗¬doable in ({eq a4})∧({eq a5}) gives

!!a6:	%2∀x y s.(doable(move(x,y),s) ≡ ¬tooheavy(x,s) ∧ clear(y,s))%1.
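
	To check this against the schema of the paper, which asserts for
each predicate expression ⊗Φ that if %2A(Φ)%1 holds and ⊗Φ implies ⊗P 
everywhere then ⊗P implies ⊗Φ everywhere, take ⊗P to be %2¬doable%1
and take

	%2Φ(move(x,y),s) ≡ tooheavy(x,s) ∨ ¬clear(y,s)%1

(and ⊗Φ false for other actions).  Then ({eq a4}) and ({eq a5}) give
%2A(Φ)%1 and that ⊗Φ implies %2¬doable%1, so the schema yields
%2¬doable(move(x,y),s) ⊃ tooheavy(x,s) ∨ ¬clear(y,s)%1, which together
with ({eq a4}) and ({eq a5}) is ({eq a6}).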

The axioms for actions usually given in AI formulations can be
regarded as obtained by this kind of circumscription from the
more open-ended subjective formulations of common sense knowledge
whose listings of what can prevent an action are subjectively
recognized to be incomplete.
The circumscriptions are done naively by the researcher when he
has to write axioms giving sufficient conditions for actions.

	This is still insufficient reification to permit statements
about when one should circumscribe.

	#. What are the reasoning processes that lead to the Amarel
representation of M and C (missionaries and cannibals) or to a similarly
simple representation
of a concrete set of blocks on a table?  The model should be
somehow "constructible" by the reasoning program once we have
decided what facts to take into account.  A crucial step seems
to be the mental act

	"Let the triplet (m,c,b) be the numbers of missionaries,
cannibals and boats on the initial bank of the river in a given
situation".

	#. What is the relation between circumscription and principles of
simplicity and Ockham's razor and all of these to %2ceteris paribus%1?

	#. Circumscriptions are made by minimizing predicates.
Are there corresponding rules for generating Reiter
or McDermott defaults?

	#. More on the propositional case.

	It is informative to consider the general form of a
circumscription of a propositional letter ⊗p in an axiom ⊗A.  We
can always write ⊗A in the form (essentially disjunctive normal form)

!!a7:	%2(a1 ⊃ p) ∧ (a2 ⊃ ¬p)%1

where ⊗a1 and ⊗a2 involve other propositional variables.  The
circumscription is simply

	%2p ≡ a1%1.
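
This comes from the schema with %2a1%1 substituted for ⊗p.  ({eq a7})
implies %2¬(a1 ∧ a2)%1, since %2a1 ∧ a2%1 would give both ⊗p and %2¬p%1,
so %2A(a1)%1 holds; and %2a1 ⊃ p%1 is the first conjunct of ({eq a7})
itself, so the schema gives %2p ⊃ a1%1.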

Suppose there is another free variable ⊗q, and we want to adjust ⊗q so
as to make ⊗p as false as possible.  Then the axiom takes the form

	%2(a1∧q ∨ a2∧¬q ⊃ p) ∧ (a3∧q ∨ a4∧¬q ⊃ ¬p)%1,

and the result of circumscription is

	%2p ≡ a1∧a2%1.
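
Here ⊗q varies in the minimization: taking ⊗q true forces ⊗p exactly
when %2a1%1 holds, taking ⊗q false forces ⊗p exactly when %2a2%1 holds,
and ⊗p is forced whatever ⊗q is chosen just when %2a1 ∧ a2%1.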

	The next simplest case is when ⊗p(x) is a unary predicate.
It isn't yet clear whether the occurrences of clauses
like

	%2p(x) ∧ ¬p(y)%1

give trouble for circumscription - i.e. results not in accordance
with intuition.

	As for examples, there seems to be no problem about obtaining
the usual S-expression partial ordering by circumscribing < in
the axiom

	%2∀xy.(x < x.y ∧ y < x.y) ∧ ∀xyz.(x < y ∧ y < z ⊃ x < z)%1.

Looking at it from another point of view, we get the axiom schema of
LISP induction.

Second thought: Actually there may be a problem in getting the ordering.
Third thought: There is no problem.  We need only plug in the usual
recursive definition of the ordering and show that it satisfies the
schema.
Fourth thought: But how do we show that it is minimal?
Fifth thought: It looks like the  minimization schema
for the program will show that it satisfies the circumscription
schema.
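
	The recursive definition meant is presumably, in the dot notation
of the axiom,

	%2∀x y.(x < y ≡ ∃u v.(y = u.v ∧ (x = u ∨ x = v ∨ x < u ∨ x < v)))%1,

and the circumscription of < in the axiom is the schema

	%2∀x y.(Φ(x,x.y) ∧ Φ(y,x.y)) ∧ ∀x y z.(Φ(x,y) ∧ Φ(y,z) ⊃ Φ(x,z))
	∧ ∀x y.(Φ(x,y) ⊃ x < y) ⊃ ∀x y.(x < y ⊃ Φ(x,y))%1

for each predicate expression ⊗Φ.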

	#. The axiom

!!a9:	%2∀x ∃y.(y > x ∧ P(y))%1

leads to a contradictory circumscription assuming that the ordering is
irreflexive.  This is because any predicate ⊗P satisfying ({eq a9}) will
be true for an infinite unbounded set of elements of the domain, but it
can be made false on any subset and remain a solution provided what
remains is unbounded.  Thus there is no minimal solution.  This is
probably the paradigm example.
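
	Over the natural numbers, for instance, ({eq a9}) is satisfied by
taking ⊗P true everywhere, by %2P(y) ≡ y > 100%1, by ⊗P true on just
the even numbers, and so on; every solution properly includes another,
so nothing satisfies the minimality half of the circumscription.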

	#. Bayesian reasoning is intrinsically non-monotonic.
Statements of probabilities including conditional probabilities
are supplemented by statements of the outcomes of experiments.
This gives new values for the %2a posteriori%1 probabilities.
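For example, with %2P(A) = 0.1%1, %2P(E|A) = 0.8%1 and %2P(E|¬A) = 0.2%1,
observing ⊗E revises the probability of ⊗A to

	%2P(A|E) = 0.8 × 0.1/(0.8 × 0.1 + 0.2 × 0.9) ≈ 0.31%1,

and conclusions drawn from the prior value may have to be withdrawn.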

	#. Sometimes one makes the default assumption that a quantity
only depends on certain others or perhaps that it depends only on what
it is explicitly stated to depend on.  Sometimes this might be formalized
by asserting that a certain function is of two arguments rather than
three or more.  This won't be so easy.
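For instance, that %2f(x,y,z)%1 in fact depends only on its first two
arguments might be put as

	%2∃g.∀x y z.(f(x,y,z) = g(x,y))%1,

but saying in general what a quantity "depends on" is where the
difficulty lies.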

	#. Another default is that entities with different names are
different.  This is the first case in which the syntactic form of what
is stated must be taken into account, and it should serve as a prototype
before dealing with more precise discourse-bound defaults.
We can limit the extent to which we must formalize the syntax by supposing
that objects have names and postulating

	%2∀x y.ifposs[x ≠ y ⊃ denotation x ≠ denotation y]%1

or

	%2∀x y u v.ifposs[names(x,u) ∧ names(y,v) ∧ x ≠ y ⊃ u ≠ v]%1,

where ⊗ifposs[<sentence>] abbreviates %2M <sentence> ⊃ <sentence>%1.
We assume that names are expressions, and we can test their equality
directly.
Of course, this also limits what we get out of formalizing the syntax.
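Thus if ⊗x and ⊗y are distinct names, say "Pegasus" and "Zeus", and
nothing asserted makes their denotations equal, the first form lets us
conclude %2denotation x ≠ denotation y%1; an assertion that the two
names denote the same thing blocks the conclusion instead of producing
a contradiction.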



References:

%3Reiter, Raymond (1979): "On Reasoning by Default" in %2TINLAP II%1
and also %2American Journal of Computational Linguistics%1, 1979,
Microfiche 80.

Perhaps this next should go in a different file, but I can't find it
at the moment.

.cb How to formalize vague concepts

.item←0

	#. Consider "The red dog sees the cat" formalized as

!!b1:	%2sees(the(red(dog)),the(cat))%1.

We want to avoid committing ourselves as to the exact semantics of
any of the above entities according to the principle that formalizing
common sense concepts should not require knowing more science than
one actually knows, but nevertheless meanings should benefit from
improvements in knowledge.  Thus we wish to preserve some vagueness
in our understanding of what ⊗dog, etc. mean, but still use Tarskian
semantics.

	#. We introduce predicates ⊗true1(sentence, context),
⊗true2(sentence, context), where ({eq b1}) can serve as the
sentence argument, but the different ⊗true predicates can take
different kinds of entities as ⊗context.  We also have
⊗value1(term,context), etc. in our language, and ⊗value(the(red(dog)), context) 
will depend on the kind of context.  We include here contexts of
fiction and hypothetical contexts.

	#. We are especially interested in what we might want to
say about ⊗the(red(dog)) that would hold for a wide collection
(possibly all) of ⊗true predicates and ⊗value functions.

	#. In so far as we don't commit ourselves to specific ⊗true and
⊗value, we are allowing some of the necessary vagueness.  We emphasize
that the vagueness is not an accident of English, or even merely
a necessity of language; the use of vague concepts is essential
for successful thought.

	#. Circumscription and other non-monotonic reasoning goes
from general statements involving the vague concepts to ⊗models 
(in the sense of science or operations research rather than in
the narrow sense of mathematical logic).

	#. The models permit a finitization.  Thus the "cases" that
a model allows can often be explicitly listed.
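
	In the M and C model, for example, the cases are just the triplets
%2(m,c,b)%1 with %20 ≤ m ≤ 3%1, %20 ≤ c ≤ 3%1 and %2b = 0%1 or %21%1,
32 in all.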

	#. If we imagine the real situation to be infinitely rich,
we have mappings from finite models into reality (e.g. or perhaps i.e.
by naming) and mappings from reality into finite models (e.g. by
properties).  Aside to mathematicians: There is an analogy with
the study of topology by singular homology (mappings from polyhedra
into topological spaces) and Čech homology (mappings from topological
spaces into polyhedra).

	#.  Possibly we can usefully treat the two
kinds of mappings separately, but it looks like we have to combine
them to get anything.  Thus when we speak of "the color of the
dog", ⊗color(the(dog)) is determined by ⊗value(the(dog),context), which
is an object in the world, and by a COLOR mapping from objects
in the world to color names.
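
	In symbols the combination might be written

	%2color(the(dog)) = COLOR(value(the(dog),context))%1,

the first mapping going into the world and the second coming back to
the finite set of color names.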